What is Hortonworks?
- Hortonworks is a software company focused on the development and support of Apache Hadoop
- Hortonworks’ flagship product is the Hortonworks Data Platform (HDP)
- HDP includes Apache Hadoop and is used for storing, processing, and analyzing large volumes of data from many sources and formats
- Provides world-class support for Enterprise Hadoop from development to production
- A single, open platform for any data and any workload
HDP 2.x
The HDP 2.x distribution incorporates many recent innovations from the Hadoop ecosystem
What is Hortonworks Sandbox?
- Hortonworks Sandbox is a personal, portable Hadoop environment that provides an easy and effective way to learn Enterprise Hadoop on a single-node cluster in a virtual machine
- It comes with a pre-configured image
HCatalog
- HCatalog is a table and storage management layer for Hadoop
- It enables users of different data processing tools (e.g., Pig, MapReduce) to more easily read and write data
- HCatalog’s table abstraction presents users with a relational view of data in HDFS, so they need not worry about where or in what format their data is stored
- By default, HCatalog supports the RCFile, CSV, JSON, SequenceFile, and ORC file formats
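As a sketch of how a storage format is declared per table, an ORC-backed table could be created from the sandbox shell with the hcat CLI. The table and column names here are hypothetical:

```shell
# Run on the sandbox shell (127.0.0.1:4200). The STORED AS clause
# selects one of the formats HCatalog supports, here ORC.
hcat -e "CREATE TABLE page_views (
           user_id STRING,
           url     STRING,
           ts      BIGINT
         )
         STORED AS ORC;"
```

Once created, the same table definition is visible to Hive directly and to Pig via HCatalog’s loader, which is the point of the shared table abstraction.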
Ambari
- A completely open framework for provisioning, managing, and monitoring Apache Hadoop clusters
- Offers an intuitive collection of tools and APIs that mask the complexity of Hadoop, simplifying cluster operation regardless of cluster size
- Features:
- Wizard-driven interface
- API-driven installations
- Granular service control
- Configuration change history
- Extensible framework
- Customizable user interface
- User views
- File Browser for accessing HDFS
- Metastore Browser for accessing Hive metadata and HCatalog
- Hive Editor for developing and running Hive queries
- Pig Editor for submitting Pig scripts
- …
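Since Ambari is API-driven, the information shown in the web UI can also be fetched over its v1 REST API. A hedged sketch, assuming the sandbox’s default admin credentials, port 8080, and the default cluster name “Sandbox” (names may differ on your install):

```shell
# List the clusters known to this Ambari instance
curl -u admin:admin http://127.0.0.1:8080/api/v1/clusters

# Check the state of one service, e.g. HDFS, on the "Sandbox" cluster
curl -u admin:admin http://127.0.0.1:8080/api/v1/clusters/Sandbox/services/HDFS
```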
Access the Linux Shell Web Client
- Start HDP Sandbox in VM and keep it running
- Go to http://127.0.0.1:4200/ to access the shell web client (or use an SSH client via
ssh root@127.0.0.1 -p 2222)
- Log in:
- Username:
root
- Password:
hadoop (on the very first login, you will be prompted to change it to a new password) or the updated password you set
Access Hortonworks Sandbox via Ambari
- Keep the Hortonworks Sandbox running in VM
- Go to http://127.0.0.1:8080/
- You may use the following users and passwords (that came with HDP) to log in
| Username | Password | Purpose |
| raj_ops | raj_ops | For infrastructure build and R&D activities |
| maria_dev | maria_dev | For preparing and getting insight from data |
| holger_gov | holger_gov | For the management of data elements |
| amy_ds | amy_ds | For exploratory data analysis, cleanup and transformation |
Reset the Ambari Admin Password
- Run this command at the prompt (127.0.0.1:4200)
ambari-admin-password-reset
- Enter your new password for the Ambari admin account. The reset may take some time to complete. If Ambari doesn’t restart automatically, restart the Ambari agent with the command:
ambari-agent restart
- Next time you can access Ambari with
- Username:
admin
- Password:
your new password
Using Ambari: the User Views
Ambari User Views: Files View
Click “Open” to Preview a File
Click “Permissions” to Change Permissions of a File
Click “Download” to Download a File to Your Local OS (Mac or Windows)
Click “Upload” to Upload a File from Your Local OS (Mac or Windows)
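The same operations the Files View exposes can also be done from the shell web client with the hdfs CLI. A sketch, assuming a hypothetical HDFS directory /user/root and hypothetical file names:

```shell
# Equivalent of Upload: copy a local file into HDFS
hdfs dfs -put /root/data.csv /user/root/data.csv

# Equivalent of Open/Preview: print the first lines of the file
hdfs dfs -cat /user/root/data.csv | head

# Equivalent of Permissions: change the file's mode
hdfs dfs -chmod 644 /user/root/data.csv

# Equivalent of Download: copy the file back to the local filesystem
hdfs dfs -get /user/root/data.csv /root/copy-of-data.csv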
Ambari User Views: Hive View
There are two Hive Views
Ambari User Views: Hive View
Prepare to Upload/Create a Table
- Change the file’s permissions in Files View (often to 777) so it can be read
- Set the correct field delimiter and, if the file has a header row, import it as the column names
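Before choosing the delimiter in the upload dialog, it helps to confirm it from the shell. A minimal sketch with a small hypothetical comma-separated file:

```shell
# Create a small sample file (hypothetical data) to illustrate
printf 'id,name,city\n1,Alice,Austin\n2,Bob,Boston\n' > /tmp/sample.csv

# Show the header row: the separator you see here is the delimiter
# to select in the Hive View upload dialog
head -n 1 /tmp/sample.csv

# With the right delimiter, every row splits into the same number of
# fields; awk -F',' prints the field count per row (3 for each row here)
awk -F',' '{ print NF }' /tmp/sample.csv
```

If the field counts differ between rows, the delimiter is wrong (or the file is malformed) and the resulting Hive table columns will be misaligned.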
Get a Description of a Table
Ambari User Views: Pig View